- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Texas (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Media (0.70)
- Leisure & Entertainment (0.47)
- Information Technology (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)
SYNAPSE-G: Bridging Large Language Models and Graph Learning for Rare Event Classification
Tavakkol, Sasan, Chen, Lin, Springer, Max, Schantz, Abigail, Bratanič, Blaž, Cohen-Addad, Vincent, Bateni, MohammadHossein
Scarcity of labeled data, especially for rare events, hinders training effective machine learning models. This paper proposes SYNAPSE-G (Synthetic Augmentation for Positive Sampling via Expansion on Graphs), a novel pipeline that leverages Large Language Models (LLMs) to generate synthetic training data for rare event classification, addressing the cold-start problem. The synthetic data serve as seeds for semi-supervised label propagation on a similarity graph constructed between the seeds and a large unlabeled dataset. Propagation identifies candidate positive examples, which are subsequently labeled by an oracle (human or LLM). The expanded dataset is then used to train or fine-tune a classifier. We theoretically analyze how the quality (validity and diversity) of the synthetic data impacts the precision and recall of our method. Experiments on the imbalanced SST2 and MHS datasets demonstrate SYNAPSE-G's effectiveness in finding positive labels, outperforming baselines including nearest neighbor search.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)
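The seed-and-propagate step described in the SYNAPSE-G abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine-similarity k-nearest-neighbour graph, the damping factor `alpha`, the number of steps, and every function name below are assumptions chosen for illustration. Seed vectors stand in for embeddings of LLM-generated positives, and the returned indices are the unlabeled items that would be sent to the oracle first.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph with cosine-similarity edge weights."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        sims = sorted(
            ((cosine(points[i], points[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for s, j in sims[:k]:
            if s > 0:
                adj[i][j] = s
                adj[j][i] = s
    return adj

def propagate(adj, seed_ids, steps=10, alpha=0.8):
    """Iterative label propagation; seed nodes stay clamped at score 1.0."""
    seeds = set(seed_ids)
    score = [1.0 if i in seeds else 0.0 for i in range(len(adj))]
    for _ in range(steps):
        new = []
        for i, nbrs in enumerate(adj):
            if i in seeds:
                new.append(1.0)
                continue
            total = sum(nbrs.values())
            avg = sum(w * score[j] for j, w in nbrs.items()) / total if total else 0.0
            new.append(alpha * avg)  # alpha < 1 damps scores far from the seeds
        score = new
    return score

def candidate_positives(seed_vecs, unlabeled_vecs, k=2, top=2):
    """Indices into unlabeled_vecs, ranked by propagated score, highest first."""
    points = list(seed_vecs) + list(unlabeled_vecs)
    score = propagate(knn_graph(points, k), range(len(seed_vecs)))
    offset = len(seed_vecs)
    ranked = sorted(range(len(unlabeled_vecs)),
                    key=lambda i: score[offset + i], reverse=True)
    return ranked[:top]

# Toy 2-D "embeddings": one synthetic seed and four unlabeled points.
seeds = [(1.0, 0.0)]
unlabeled = [(0.9, 0.1), (0.0, 1.0), (0.8, 0.3), (0.1, 0.9)]
candidates = candidate_positives(seeds, unlabeled)
```

In this toy setup the two unlabeled points that lie close to the seed rank highest. Unlike a plain nearest-neighbour search, the propagated score also flows through intermediate unlabeled nodes, so items connected to the seeds only indirectly can still surface as candidates.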
Neural Network Approximators for Marginal MAP in Probabilistic Circuits
Arya, Shivvrat, Rahman, Tahrima, Gogate, Vibhav
Probabilistic circuits (PCs) such as sum-product networks efficiently represent large multivariate probability distributions. They are preferred in practice over other probabilistic representations such as Bayesian and Markov networks because PCs can solve marginal inference (MAR) tasks in time that scales linearly in the size of the network. Unfortunately, the maximum-a-posteriori (MAP) and marginal MAP (MMAP) tasks remain NP-hard in these models. Inspired by recent work on using neural networks to generate near-optimal solutions to optimization problems such as integer linear programming, we propose an approach that uses neural networks to approximate (M)MAP inference in PCs. The key idea in our approach is to approximate the cost of an assignment to the query variables using a continuous multilinear function, and then use the latter as a loss function. The two main benefits of our new method are that it is self-supervised and that, after the neural network is learned, it requires only linear time to output a solution. We evaluate our new approach on several benchmark datasets and show that it outperforms three competing linear-time approximations that are used in practice to solve MMAP tasks in PCs: max-product inference, max-marginal inference, and sequential estimation.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Texas (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
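The idea of a continuous multilinear cost over query assignments, mentioned in the abstract above, can be illustrated on a toy circuit. This is a sketch under stated assumptions, not the authors' method: the two-component circuit, its parameters, and the choice of negative log as the loss are all hypothetical. Each Bernoulli leaf is relaxed so that its input may take any value in [0, 1]; because every leaf is linear in its input and the circuit only multiplies leaves over distinct variables and takes weighted sums, the circuit output is multilinear in the relaxed inputs and agrees with the true probability at 0/1 inputs.

```python
import math

def leaf(x, p):
    """Bernoulli leaf relaxed to x in [0, 1]; equals P(X = x) when x is 0 or 1."""
    return x * p + (1.0 - x) * (1.0 - p)

# Hypothetical toy circuit: a sum node over two product nodes,
# each factorising over a query variable Q and an evidence variable E.
WEIGHTS = [0.6, 0.4]   # sum-node weights
P_QUERY = [0.9, 0.1]   # P(Q = 1) in each component
P_EVID = [0.2, 0.7]    # P(E = 1) in each component

def circuit(q, e):
    """Circuit value at a (possibly fractional) query input q and evidence e."""
    return sum(w * leaf(q, pq) * leaf(e, pe)
               for w, pq, pe in zip(WEIGHTS, P_QUERY, P_EVID))

def loss(q, e):
    """Assumed loss: negative log of the relaxed circuit value (lower is better)."""
    return -math.log(circuit(q, e))

# With evidence E = 1, compare the two discrete query assignments.
best_q = min((0.0, 1.0), key=lambda q: loss(q, 1.0))
```

In a learned setting, a network would map the evidence to a value of `q` in [0, 1], be trained by minimising such a loss without labeled solutions, and have its output rounded to a discrete assignment. Marginalised variables are omitted here for brevity.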